ABSTRACT


Many environmental sampling problems involve some specified regulatory or contractual
limit (RL). Often the interest is in estimating the percentile of the underlying contaminant
concentration distribution corresponding to RL. In previous reports, we have discussed the
problem of determining a lower 100(1-α)% confidence limit for that percentile when n
observations are available, but are all known to be less than a detection limit DL, where
DL < RL. In this report we extend those results to the situation in which more than a single
detection limit is involved.


1.  PROBLEM  DEFINITION 


Many  environmental  sampling  problems  involve  some  specified  regulatory  or  contractual 
limit  (RL).  Such  problems  exist  whether  sampling  air,  water,  soil,  or  living  organisms.  For 
example,  one  might  be  analyzing  air  samples  in  buildings  for  CO,  water  samples  from  lakes  for 
pesticides,  soil  samples  from  dump  sites  for  arsenic,  or  leaf  samples  from  trees  for  lead.  Often 
the  interest  is  in  estimating  pRL,  a  specified  percentile  of  the  underlying  contaminant  concentration 
distribution  corresponding  to  RL. 

The problem addressed in this paper is the estimation of the desired percentile pRL based
on a sample of n observations, all of which are nondetectable; that is, the ith observation is
known only to be less than some detection limit DL_i < RL. In other words, we are considering
a sample in which all observations are censored. We assume throughout that the sample is
representative.

Given n observations, each known to be less than RL, a binomial lower limit on pRL is
given by:

$$ p_{RL} \ge \alpha^{1/n}, $$

where (1-α) is the desired confidence level. This lower limit makes no use of the information
that the observation x_i is less than DL_i (which may be much less than RL).
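
As a quick numerical illustration of the binomial limit, the short Python sketch below evaluates α^(1/n) for a few sample sizes. The function name binomial_lower_limit is an assumption made for this example and does not come from the original report.

```python
def binomial_lower_limit(n, alpha=0.05):
    """Lower 100(1 - alpha)% limit on pRL when all n observations are below RL."""
    if n < 1 or not 0.0 < alpha < 1.0:
        raise ValueError("need n >= 1 and 0 < alpha < 1")
    # With p = Pr{X < RL}, the chance that all n observations fall below RL
    # is p**n; setting p**n = alpha and solving for p gives the limit.
    return alpha ** (1.0 / n)


for n in (10, 20, 30):
    print(n, round(binomial_lower_limit(n), 3))
# prints 0.741, 0.861 and 0.905 for n = 10, 20 and 30
```

These values agree with the Binomial column of Table 2, and they depend only on n and α, not on how far the detection limits lie below RL.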

2.  BACKGROUND


There  are  a  number  of  procedures  that  have  been  proposed  for  dealing  with  estimation 
problems  when  some  observations  in  a  set  of  data  are  censored,  and  reported  only  as  less  than  a 
detection  limit.  These  include,  for  example,  simple  substitution  methods,  maximum  likelihood 
estimation,  and  regression  methods. 

Haas and Scheff [1] and Helsel and Gilliom [2] have evaluated the performance of a
number of suggested approaches. Any of the methods can be used to provide an estimate of pRL
when there are a number of uncensored observations in the sample. However, none can be used
to deal with the problem defined in the previous section, in which every observation is
censored.

In  two  previous  reports  [4,5],  we  proposed  a  procedure  that  is  applicable  to  the  situation 
where  all  observations  are  left-censored  at  the  same  value  DL  <  RL.  Our  first  report  [4]  was 
based  on  the  assumption  of  an  underlying  lognormal  distribution.  That  is  the  usual  assumption 
for  contaminants  present  in  small  quantities. 

However,  there  are  some  cases  in  which  the  assumption  of  a  normal  distribution  may  be 
more  reasonable.  For  example,  if  the  cost  of  sampling  is  small  relative  to  the  cost  of  chemical 
analysis,  composite  samples  may  be  used.  Whatever  the  underlying  distribution  of  contaminant 
concentrations,  the  distribution  of  concentrations  in  the  composite  samples  will  tend  toward 
normality.  Therefore,  our  second  report  [5]  was  based  on  the  assumption  of  an  underlying  normal 
distribution.  This  report  extends  the  results  of  that  report  by  considering  the  situation  in  which 
multiple  detection  limits  are  involved. 

3.  PROPOSED  PROCEDURE


Given a sample x = (x_1, x_2, ..., x_n) from the distribution of the random variable X, we
want a lower 100(1-α)% confidence limit for pRL = Pr{X < RL}. It is assumed that X is
normally distributed and that each observation x_i < DL_i < RL, where DL_i denotes the
detection limit for the ith observation and RL denotes the regulatory limit of interest.

The usual confidence limit for a percentile, which is also known as a tolerance limit, is of
the form x̄ + k·s, where x̄ and s are the sample mean and standard deviation, respectively. A
tolerance limit p* for pRL can be expressed as:

$$ \Pr\{\Pr(X < \bar{x} + k s) \ge p^*\} = 1 - \alpha. $$

Of course, in the situation we are considering, none of the x_i values are known.

However, since larger values of k correspond to larger values of p*, a conservative lower
bound for pRL can be found by minimizing k subject to the restriction that x̄ + k·s = RL. That
is, we want to minimize k = (RL - x̄)/s subject to the constraints 0 ≤ x_i ≤ DL_i for all i.
This procedure finds the worst-case sample, subject to the constraints. It is shown in the
next section that each of the n observations in this worst-case sample is either equal to the
corresponding detection limit DL_i or equal to zero.

Given k, a lower bound for pRL can be found from a table of normal tolerance limits, using
the desired confidence level. If the required software is available, exact values can be
obtained using the noncentral t distribution function, as described in Section 5. An extensive
discussion of the noncentral t distribution and its use in computing tolerance limits can be
found in [3].

4.  MATHEMATICAL  DETAILS


Our objective is to minimize, subject to the constraints that 0 ≤ x_i ≤ DL_i < RL, the
function (RL - x̄)/s. Since this function is positive on this region, this is equivalent to
maximizing its reciprocal. For analytical convenience, we work with the squared reciprocal:

$$ f(x) = \frac{s^2}{(RL - \bar{x})^2} = \frac{\sum_i (x_i - \bar{x})^2}{(n-1)(RL - \bar{x})^2}. $$

Consider the partial derivative of this function with respect to an individual observation
x_j:

$$ \frac{\partial f(x)}{\partial x_j} = \frac{(RL - \bar{x})(x_j - \bar{x}) + (1/n)\sum_i (x_i - \bar{x})^2}{0.5\,(n-1)(RL - \bar{x})^3}. $$

Let g(x_j) denote the numerator of this expression.

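It can be verified that:

$$ g'(x_j) = \frac{1}{n} \sum_{i \ne j} (RL - x_i) > 0, $$

so g(x_j) is increasing. Because the denominator of the partial derivative is positive on the
feasible region, the derivative has the same sign as g(x_j); it can therefore change sign at
most once, from negative to positive, as x_j increases. Hence f has no interior maximum in
x_j, and f is maximized either at x_j = 0 or at x_j = DL_j.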
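
The expression for the derivative of the numerator can be checked by direct differentiation. The steps below are a worked sketch that uses only the definitions already given; the shorthand S for the sum of squared deviations is introduced here for convenience and does not appear in the original.

```latex
% Shorthand (assumption for this sketch): S = \sum_i (x_i - \bar{x})^2
g(x_j) = (RL - \bar{x})(x_j - \bar{x}) + \tfrac{1}{n} S,
\qquad
\frac{\partial \bar{x}}{\partial x_j} = \frac{1}{n},
\qquad
\frac{\partial S}{\partial x_j} = 2 (x_j - \bar{x}).

g'(x_j)
  = -\tfrac{1}{n}(x_j - \bar{x})
    + \left(1 - \tfrac{1}{n}\right)(RL - \bar{x})
    + \tfrac{2}{n}(x_j - \bar{x})
  = \tfrac{1}{n}\left[(n-1)\,RL + x_j - n\bar{x}\right]
  = \tfrac{1}{n} \sum_{i \ne j} (RL - x_i).
```

Each term in the final sum is positive because every x_i is less than RL.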

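Now consider f(x) as a function of x_i and x_j, with detection limits DL_i and DL_j,
respectively, and write f(a, b) for the value of f when x_i = a and x_j = b, with the other
observations held fixed. Suppose that f(x) is maximized when x_i = 0 and x_j = DL_j. Note that
f is symmetric in its arguments, so f(0, DL_j) = f(DL_j, 0). Now, if DL_i > DL_j, then the
point (DL_j, 0) is feasible with x_i strictly inside its allowed range, so either:

(1) f(DL_i, 0) > f(DL_j, 0)
or (2) f(0, 0) > f(DL_j, 0),

which violates the assumption that f(0, DL_j) is a maximum. Therefore, DL_i ≤ DL_j; that is,
any observations set to zero in the worst-case sample are those with the smallest detection
limits.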

Thus, the procedure to be followed to maximize f(x) is to sort the detection limits in
ascending order, DL(1) ≤ DL(2) ≤ ... ≤ DL(n), and then compute:

f(DL(1), DL(2), ..., DL(n)),
f(0, DL(2), ..., DL(n)),
...
f(0, 0, ..., DL(n)).

One of these n calculations will result in a maximum value of f(x).


5.  SOME  RESULTS 


Following [3], the tolerance limit equality in Section 3 can be reexpressed as:

$$ \Pr\{T_{n-1} \le k\,n^{1/2} \mid \delta\} = 1 - \alpha, $$

where T_{n-1} has a noncentral t distribution with (n-1) degrees of freedom and noncentrality
parameter δ. The noncentrality parameter is given by:

$$ \delta = n^{1/2}\,\Phi^{-1}(p^*), \quad \text{so} \quad p^* = \Phi(\delta\,n^{-1/2}), $$

where Φ denotes the standard normal distribution function. Therefore, given k, n and the
desired confidence level (1-α), one can search for δ and solve for p*, the lower bound on pRL.
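
The search for the noncentrality parameter can be sketched in Python using only the standard library. This is an illustrative sketch under stated assumptions, not the software used for the original tables: the function names nct_cdf and lower_bound_p are hypothetical, the noncentral t distribution function is approximated by midpoint-rule integration over the scaled chi variable, and the root is found by bisection.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def nct_cdf(t, df, delta, steps=2000):
    """Approximate P(T <= t) for a noncentral t variable with df degrees of freedom.

    Writes T = (Z + delta) / W with W = sqrt(chi2_df / df), so that
    P(T <= t) = E[Phi(t * W - delta)], and integrates over the density of W.
    """
    w_max = 1.0 + 12.0 / math.sqrt(df)  # assumed cutoff; mass beyond it is negligible
    h = w_max / steps
    log_c = math.log(2.0) + 0.5 * df * math.log(0.5 * df) - math.lgamma(0.5 * df)
    total = 0.0
    for i in range(steps):
        w = (i + 0.5) * h  # midpoint of the i-th subinterval
        log_density = log_c + (df - 1.0) * math.log(w) - 0.5 * df * w * w
        total += math.exp(log_density) * _STD_NORMAL.cdf(t * w - delta)
    return total * h


def lower_bound_p(k, n, alpha=0.05, tol=1e-8):
    """Solve Pr{T_{n-1} <= k sqrt(n) | delta} = 1 - alpha, then return Phi(delta / sqrt(n))."""
    t = k * math.sqrt(n)
    df = n - 1
    bound = 10.0 + abs(t) * (1.0 + 12.0 / math.sqrt(df))
    lo, hi = -bound, bound  # the distribution function decreases as delta increases
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if nct_cdf(t, df, mid) > 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return _STD_NORMAL.cdf(0.5 * (lo + hi) / math.sqrt(n))


# Worst-case k for ten observations with a common detection limit and r = 1.0.
print(round(lower_bound_p(0.3162, 10), 3))
```

For n = 10 and k = 0.3162 the result should be close to the .411 reported in Table 2 for F = 1.00 and r = 1.0. Where a statistics library with a noncentral t distribution function is available, it can replace nct_cdf directly.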

Our previous paper [5] provided estimates of pRL, given by Φ(k), and lower 95%
(α = .05) bounds for pRL for various sample sizes and values of r = RL/DL. Tables 1 and 2
extend these results by considering multiple detection limits. Specifically, the tables
present the estimates for the cases where 20%, 50%, and 80% of the detection limits are DL
and the remaining ones are .5DL. Also included are the binomial lower 95% limits on pRL and
the case where 100% of the detection limits are DL, which were presented in the previous
paper.

Note  that  the  procedure  addressed  in  this  paper  provides  point  estimates  of  pRL  in  each 
case,  which  the  binomial  approach  does  not  (except  for  the  uninformative  1.0).  Likewise, 
because  the  95%  confidence  bounds  do  use  the  information  given  by  the  detection  limits,  the 
procedure  performs  better  than  the  binomial  method,  except  for  situations  in  which  r  is  close  to 
1.0. 

It appears that the procedure discussed in this paper should prove useful in many cases
where a sample is encountered in which all observations are less than detection limits. This
is particularly true for larger values of r and smaller values of F.


Sample
 Size      F     r = 1.0   r = 1.5   r = 2.0   r = 2.5   r = 3.0

  10      .20     .932      .997     >.9999    >.9999    >.9999
          .50     .802      .971      .998      .9999    >.9999
          .80     .802      .951      .996      .9999    >.9999
         1.00     .624      .951      .996      .9999    >.9999

  20      .20     .937      .998     >.9999    >.9999    >.9999
          .50     .809      .974      .998     >.9999    >.9999
          .80     .675      .954      .997      .9999    >.9999
         1.00     .588      .954      .997      .9999    >.9999

  30      .20     .939      .998     >.9999    >.9999    >.9999
          .50     .810      .975      .998     >.9999    >.9999
          .80     .676      .956      .997      .9999    >.9999
         1.00     .572      .956      .997      .9999    >.9999

Table 1: Estimated Values of pRL (r = RL/DL) when a Fraction, F, of the Observations in
         the Sample Have Detection Limit DL and the Remainder Have Detection Limit
         .5DL


Sample
 Size      F     r = 1.0   r = 1.5   r = 2.0   r = 2.5   r = 3.0   Binomial

  10      .20     .756      .941      .990      .999     >.9999     .741
          .50     .586      .835      .946      .986      .975      .741
          .80     .455      .791      .934      .984      .997      .741
         1.00     .411      .791      .934      .984      .997      .741

  20      .20     .834      .977      .998     >.9999    >.9999     .861
          .50     .666      .903      .980      .997      .9998     .861
          .80     .525      .863      .973      .996      .9997     .861
         1.00     .440      .863      .973      .996      .9997     .861

  30      .20     .861      .986      .999     >.9999    >.9999     .905
          .50     .698      .925      .987      .9992    >.9999     .905
          .80     .555      .889      .982      .998      .9999     .905
         1.00     .451      .889      .982      .998      .9999     .905

Table 2: Conservative Lower 95% Bounds for pRL (r = RL/DL) when a Fraction, F, of the
         Observations in the Sample Have Detection Limit DL and the Remainder Have
         Detection Limit .5DL